Ultra accurate collaborative information filtering via directed user similarity
Authors
Abstract
A key challenge of collaborative filtering (CF) information filtering is how to obtain reliable and accurate results with the help of peers’ recommendations. Since the similarities from small-degree users to large-degree users would be larger than those in the opposite direction, the selections of large-degree users are recommended extensively by traditional second-order CF algorithms. By considering the direction of user similarity and the second-order correlations to depress the influence of mainstream preferences, we present the directed second-order CF (HDCF) algorithm specifically to address the challenge of accuracy and diversity of the CF algorithm. Numerical results for two benchmark data sets, MovieLens and Netflix, show that the new algorithm is more accurate than the state-of-the-art CF algorithms. Compared with the random-walk-based CF algorithm proposed in Ref. [7], the average ranking score reaches 0.0767 and 0.0402, an improvement of 27.3% and 19.1% for MovieLens and Netflix, respectively. In addition, the diversity, precision and recall are also greatly enhanced. Without relying on any context-specific information, tuning the similarity direction of CF algorithms can yield accurate and diverse recommendations. This work suggests that the user similarity direction is an important factor in improving personalized recommendation performance.
Introduction. – With the rapid development of the Internet, we are confronted with the problem of information overload [1, 2]. To overcome this dilemma, various recommender algorithms [3–7,9,10], which attempt to predict users’ interests by analyzing their historical activities, have been proposed. So far, the collaborative filtering (CF) algorithm [11,12] has been one of the most successful recommendation algorithms; it is designed on the assumption that users with similar preferences will rate similar objects.
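The abstract's observation that similarity from a small-degree user to a large-degree user exceeds the similarity in the opposite direction can be illustrated with a small sketch. The formula used here is an assumption: a common mass-diffusion (random-walk) form in which the similarity is normalized by the degree of the source user. It is not taken from the text above, and the toy data and the names `selections`, `object_degrees` and `directed_similarity` are hypothetical.

```python
# Hypothetical toy data: each user maps to the set of objects they collected.
selections = {
    "small": {"a", "b"},                      # small-degree user (2 objects)
    "large": {"a", "b", "c", "d", "e", "f"},  # large-degree user (6 objects)
}


def object_degrees(selections):
    """Count how many users collected each object."""
    degrees = {}
    for objects in selections.values():
        for obj in objects:
            degrees[obj] = degrees.get(obj, 0) + 1
    return degrees


def directed_similarity(source, target, selections, degrees):
    """Assumed mass-diffusion similarity from `source` to `target`.

    s(source -> target) = (1 / k_source) * sum over shared objects of 1 / k_object,
    where k_source is the number of objects collected by `source`.
    """
    shared = selections[source] & selections[target]
    k_source = len(selections[source])
    return sum(1.0 / degrees[obj] for obj in shared) / k_source


degrees = object_degrees(selections)
small_to_large = directed_similarity("small", "large", selections, degrees)
large_to_small = directed_similarity("large", "small", selections, degrees)
print(small_to_large, large_to_small)
```

Under this assumed form the two directions differ only by the normalizing degree, so the ratio of the two similarities equals the ratio of the users' degrees. That makes the asymmetry described in the abstract easy to see on a two-user example.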
When predicting the potential interests of a given user, the CF algorithm first identifies the neighborhood of each user by calculating the similarities between all pairs of users, and then makes recommendations based on the neighbors’ selections. It is well known that the most important ingredient in determining the performance of the CF algorithm is how to precisely define the similarity between each pair of users [13,14]. Based on the user-object bipartite network, the cosine similarity [15] is the most widely used index to quantify the proximity of users’ tastes. In addition, Sarwal et al. [16] proposed the item-based CF algorithm by comparing different items. Deshpande and Karypis [17] proposed the item-based top-N CF algorithm, in which items are ranked according to how frequently they appear in the set of similar items and the top-N ranked items are returned. Luo et al. [18] introduced the concepts of local and global user similarity, based on surprisal-based vector similarity and the concept of maximum distance in graph theory. Recently, some physical dynamics, such as random walks [6,7] and heat conduction [19], have found applications in user or item similarity measurement to generate recommendation algorithms. Liu et al. [7] embedded the random-walk process into the CF algorithm to calculate the user similarity and found that the random-walk-based CF algorithm had remarkable accuracy. By taking into account the second-order correlations of the objects and users, Zhou et al. [20] proposed improved CF algorithms by depressing the influence of mainstream preferences. The simulation results show that both accuracy and diversity of the improved CF algorithms could be en-
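The two-step procedure described in the introduction (compute pairwise user similarities, then recommend from the neighbors' selections) can be sketched as follows. This is a minimal illustration using the cosine similarity on an unweighted user-object bipartite network, not the algorithm proposed in the paper. The toy data and the names `selections`, `cosine_similarity` and `recommend` are hypothetical, and scores are left unnormalized, which is a simplification that does not change the ranking.

```python
import math

# Hypothetical user-object bipartite network: user -> set of collected objects.
selections = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b", "d"},
    "u3": {"c", "e"},
}


def cosine_similarity(u, v, selections):
    """Cosine similarity: shared objects over the geometric mean of the degrees."""
    shared = len(selections[u] & selections[v])
    return shared / math.sqrt(len(selections[u]) * len(selections[v]))


def recommend(target, selections, top_n=2):
    """Score each uncollected object by the summed similarity of its collectors."""
    scores = {}
    for neighbor, objects in selections.items():
        if neighbor == target:
            continue
        sim = cosine_similarity(target, neighbor, selections)
        # Only objects the target user has not collected are candidates.
        for obj in objects - selections[target]:
            scores[obj] = scores.get(obj, 0.0) + sim
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [obj for obj, _ in ranked[:top_n]]


recommendations = recommend("u1", selections)
print(recommendations)
```

In this toy network, `u2` shares two objects with `u1` while `u3` shares one, so the object collected only by `u2` is ranked ahead of the object collected only by `u3`.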
Similar resources
A New Similarity Measure Based on Item Proximity and Closeness for Collaborative Filtering Recommendation
Recommender systems utilize information retrieval and machine learning techniques for filtering information and can predict whether a user would like an unseen item. User similarity measurement plays an important role in collaborative filtering based recommender systems. In order to improve accuracy of traditional user based collaborative filtering techniques under new user cold-start problem a...
A hybrid recommender system using trust and bi-clustering to improve the efficiency of collaborative filtering
In the present era, the amount of information grows exponentially, so finding the required information among the mass of information has become a major challenge. The success of e-commerce systems and online business transactions depends greatly on the effective design of the product recommender mechanism. Providing high quality recommendations is important for e-commerce systems to assist users i...
A NOVEL FUZZY-BASED SIMILARITY MEASURE FOR COLLABORATIVE FILTERING TO ALLEVIATE THE SPARSITY PROBLEM
Memory-based collaborative filtering is the most popular approach to build recommender systems. Despite its success in many applications, it still suffers from several major limitations, including data sparsity. Sparse data affect the quality of the user similarity measurement and consequently the quality of the recommender system. In this paper, we propose a novel user similarity measure based...
Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems
One of the most famous methods for recommendation is user-based Collaborative Filtering (CF). This system compares the active user’s item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...
Solving the accuracy-diversity dilemma via directed random walks
Random walks have been successfully used to measure user or object similarities in collaborative filtering (CF) recommender systems, an approach that yields high accuracy but low diversity. A key challenge of a CF system is that reliably accurate results are obtained with the help of peers' recommendations, but the most useful individual recommendations are hard to find among diverse niche objects. ...
Journal title:
- CoRR
Volume abs/1407.7049 Issue
Pages -
Publication date 2014